Locally Differentially Private Heavy Hitter Identification

Authors

  • Tianhao Wang
  • Ninghui Li
  • Somesh Jha
Abstract

The notion of Local Differential Privacy (LDP) enables users to answer sensitive questions while preserving their privacy. The basic LDP frequency oracle protocol enables the aggregator to estimate the frequency of any value. But when the domain of input values is large, finding the most frequent values, also known as the heavy hitters, by estimating the frequencies of all possible values is computationally infeasible. In this paper, we propose an LDP protocol for identifying heavy hitters. In our proposed protocol, which we call the Prefix Extending Method (PEM), users are divided into groups, with the users in each group reporting a prefix of their value. We analyze how to choose optimal parameters for the protocol and identify two design principles for designing LDP protocols with high utility. Experiments on both synthetic and real-world datasets demonstrate the advantage of our proposed protocol.
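The prefix-extending idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes values are fixed-length bit strings stored as integers, that group j reports a prefix that is step_bits longer than the prefix of group j-1, and that the aggregator keeps the k prefixes with the highest estimates at each step. Generalized randomized response is swapped in as a simple stand-in for the frequency oracle. All function and parameter names (grr_perturb, grr_estimate, pem_heavy_hitters, start_bits, step_bits) are hypothetical.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain_size, epsilon, rng):
    """Generalized randomized response (assumed stand-in frequency oracle).

    Reports the true value with probability p, otherwise a uniformly
    random different value from the domain.
    """
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    if rng.random() < p:
        return value
    other = rng.randrange(domain_size - 1)  # uniform over the other values
    return other + 1 if other >= value else other


def grr_estimate(reports, candidates, domain_size, epsilon):
    """Unbiased count estimates, computed only for the candidate values."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + domain_size - 1)
    q = 1.0 / (e + domain_size - 1)
    counts = Counter(reports)
    return {c: (counts[c] - n * q) / (p - q) for c in candidates}


def pem_heavy_hitters(values, total_bits, start_bits, step_bits, k, epsilon, rng):
    """Sketch of prefix extension: each group reports a longer prefix."""
    if (total_bits - start_bits) % step_bits != 0:
        raise ValueError("prefix lengths must end exactly at total_bits")
    lengths = list(range(start_bits, total_bits + 1, step_bits))
    # Assumption: users are split evenly; user i joins group i mod g.
    groups = [values[i::len(lengths)] for i in range(len(lengths))]
    # The first group is checked against every prefix of the initial length.
    candidates = list(range(2 ** start_bits))
    top = []
    for length, group in zip(lengths, groups):
        domain = 2 ** length
        # Each user perturbs the leading `length` bits of their own value.
        reports = [
            grr_perturb(v >> (total_bits - length), domain, epsilon, rng)
            for v in group
        ]
        est = grr_estimate(reports, candidates, domain, epsilon)
        top = sorted(est, key=est.get, reverse=True)[:k]
        # Extend every surviving prefix by all possible step_bits suffixes.
        candidates = [
            (prefix << step_bits) | suffix
            for prefix in top
            for suffix in range(2 ** step_bits)
        ]
    return top


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: two frequent 8-bit values plus uniform noise.
    data = [0b10110011] * 3000 + [0b01001110] * 1800
    data += [rng.randrange(256) for _ in range(1200)]
    rng.shuffle(data)
    print(pem_heavy_hitters(data, 8, 4, 2, 2, 4.0, rng))
```

In this sketch the aggregator never estimates more than k times 2^step_bits candidates after the first group, which is how prefix extension avoids scanning the full value domain. The group sizes, prefix lengths, and choice of oracle here are illustrative only; the paper analyzes how to choose such parameters.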

Similar resources

Efficient Distinct Heavy Hitters for DNS DDoS Attack Detection

Motivated by a recent new type of randomized Distributed Denial of Service (DDoS) attacks on the Domain Name Service (DNS), we develop novel and efficient distinct heavy hitters algorithms and build an attack identification system that uses our algorithms. Heavy hitter detection in streams is a fundamental problem with many applications, including detecting certain DDoS attacks and anomalies. A...

Heavy Hitters and the Structure of Local Privacy

We present a new locally differentially private algorithm for the heavy hitters problem which achieves optimal worst-case error as a function of all standardly considered parameters. Prior work obtained error rates which depend optimally on the number of users, the size of the domain, and the privacy parameter, but depend sub-optimally on the failure probability. We strengthen existing lower bo...

Pan-private Algorithms: When Memory Does Not Help

Consider updates arriving online in which the t-th input is (i_t, d_t), where the i_t's are thought of as IDs of users. Informally, a randomized function f is differentially private with respect to the IDs if the probability distribution induced by f is not much different from that induced by it on an input in which occurrences of an ID j are replaced with some other ID k. Recently, this notion was ext...

Identifying Heavy-Hitter Flows from Sampled Flow Statistics

With the rapid increase of link speed in recent years, packet sampling has become a very attractive and scalable means of collecting flow statistics; however, it also makes inferring original flow characteristics much more difficult. In this paper, we develop techniques and schemes to identify flows with a very large number of packets (also known as heavy-hitter flows) from sampled flow statist...

A Simple Mechanism for Throttling High-Bandwidth Flows

This letter presents BREATHe, a simple packet dropping scheme for identifying and throttling unresponsive or misbehaving high-bandwidth flows during times of congestion. BREATHe is different from the existing active queue management techniques in that it uses heavy-hitter set analysis to identify high-bandwidth flows rather than sampling or rate estimation. Specifically, BREATHe uses heavy-hitter...

Journal title:
  • CoRR

Volume: abs/1708.06674

Publication date: 2017